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We extend in two directions our previous results about the sam¬ 
pling and the empirical measures of immortal branching Markov pro¬ 
cesses. Direct applications to molecular biology are rigorous estimates 
of the mutation rates of polymerase chain reactions from uniform 
samples of the population after the reaction. First, we consider nonho- 
mogeneous processes, which are more adapted to real reactions. Sec¬ 
ond, recalling that the first moment estimator is analytically known 
only in the infinite population limit, we provide rigorous confidence 
intervals for this estimator that are valid for any finite population. 

Our bounds are explicit, nonasymptotic and valid for a wide class 
of nonhomogeneous branching Markov processes that we describe in 
detail. In the setting of polymerase chain reactions, our results imply 
that enlarging the size of the sample becomes useless for surpris¬ 
ingly small sizes. Establishing confidence intervals requires precise 
estimates of the second moment of random samples. The proof of 
these estimates is more involved than the proofs that allowed us, in 
a previous paper, to deal with the first moment. On the other hand, 
our method uses various, seemingly new, monotonicity properties of 
the harmonic moments of sums of exchangeable random variables. 

Introduction. The incomplete replications of DNA sequences and their 
mutations that occur during successive cycles of a biochemical reaction called 
the polymerase chain reaction (PCR) can be modeled, under various simpli¬ 
fying hypotheses, by a branching process with a suitable branching mech¬ 
anism; see Sun (1995) and Weiss and von Haeseler (1995). Sun proposed a 
point estimator of the mutation rate of homogeneous reactions that is valid, 
in fact, in the infinitely-many-sites and infinite-population limits. Sun’s es¬ 
timator is based on the first moment method and was adapted by Wang, 
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Zhang, Cheng and Sun (2000) to the finitely-many-sites case, still for the 
infinite-population limit of homogeneous reactions. In Piau (2002, 2004a), 
we showed that the branching process introduced by these authors was but 
an example of a wider class of processes that we called immortal branching 
Markov processes. We studied in-depth properties of these processes, espe¬ 
cially in the case of polymerase chain reactions. Thus, we provided explicit 
bounds of the discrepancy between the point estimator of a finite-population 
homogeneous reaction and its infinite-population limit, in cases of both in¬ 
finitely many sites and finitely many sites. 

In this paper, we refine our methods and adapt them to nonhomogeneous 
reactions. This provides confidence intervals for the point estimator of the 
mutation rate. Also, we apply our results to a published data set and we 
comment on some estimation aspects of the model. For the sake of simplic¬ 
ity, we restrict the exposition to the so-called additive model, that is, to 
the infinitely-many-sites case, although similar results hold in the finitely- 
many-sites case. Finally, we show that our techniques allow us to deal with 
more general branching Markov processes. We explain in detail how to get 
pointwise estimates in this wider context and we leave as straightforward 
extensions the computation of confidence intervals. 

Coming back to the molecular biology context, the first consequence of 
our results is that Sun’s first moment method, supplemented by the cor¬ 
rection that the finiteness of the initial population induces and by explicit 
confidence intervals, is also available for PCR with variable efficiencies. This 
provides an alternative to the estimation of the mutation rate through Monte 
Carlo simulations based on the properties of the coalescent that was pro¬ 
posed by Weiss and von Haeseler (1997). To our knowledge, our results are 
the first rigorous results that deal with nonhomogeneous reactions for finite 
populations. Second, we exhibit realistic efficiency sequences such that the 
finite-population correction is significant: In one case, we are able to show 
that the correct estimator is more than 33% and less than 63% higher than 
its infinite-population approximation for every sample. Conversely, we prove 
that the finite-population correction is negligible as soon as the parameters 
fulfill a simple condition. Third, we show that, for finite populations, the 
first moment method yields an estimator that is not consistent, that is, 
whose variance does not converge to zero when the size of the sample goes 
to infinity. Thus, poor confidence intervals are an intrinsic feature of this 
setting. 

In actual reactions, the efficiency decreases along the successive cycles of 
the reaction (see Section 1 for a definition of the efficiency of the reaction or, 
more precisely, of a cycle of the reaction). The reduced sterical accessibility 
to the DNA sequences when the population is large is among several plausi¬ 
ble biochemical reasons for this phenomenon. This shows that the efficiency 
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of a cycle should be random and depend on the size of the population be¬ 
fore that cycle. We present some extensions of our results to this setting. In 
particular, Schnell and Mendoza (1997) suggested that the kinetics of PCR 
reactions follow a Michaelis-Menten law. That is, the efficiency of the 
nth cycle depends on the population Sn-i before the nth cycle, with 

(1) Xn = D/iC + Sn-l). 

Here C denotes the (usually quite large) Michaelis-Menten constant of the 
reaction, D is of the order of magnitude of C and D < C + Sq. (Schnell 
and Mendoza suggested choosing D = (7 + 1 so as to get Ai = 1 if = 1, 
as the greatest available efficiency.) When the initial population Sq is such 
that So C, this allows us to recover the initial exponential growth phase, 
followed by a linear increase of the number of molecules; see Jagers and 
Klebaner (2003). Also note that Michaelis-Menten kinetics imply that when 
So —> cx), the largest value of the sequence (A^), namely Ai = D/{C + Sq)^ 
converges to zero. In other words, the underlying branching process becomes 
critical in the So ^ oo limit. 

Point estimators and confidence intervals are consequences of precise 
bounds of the mean and the variance of a uniform sample. In turn, these 
follow from the study of the empirical measure of the population. Our meth¬ 
ods ultimately rely on rather sharp bounds of the harmonic means of sums 
of i.i.d. or exchangeable random variables. Thus, on our way, we state and 
prove various new results about these means that are often valid in a broader 
context and, in particular, some simple monotonicity properties that seem 
to have been unnoticed until now. 

The model of the PCR by a branching process is in Section 1, as well as 
a sample of the results of the paper. Some notation used in the paper are 
collected in Section 2. Theoretical results on the moments of samples are in 
Section 4. These follow from the results on empirical measures of Section 3. 
Uniform bounds are available even for random efficiencies, as explained in 
Section 5, and for a much more general model of branching processes, as 
explained in Section 6. Consequences with regard to the estimation of mu¬ 
tation rates are described in Section 7. In Section 8, we apply the method to 
the published data set used by Weiss and von Haeseler (1997). Some com¬ 
ments about the estimation of the efficiencies are in Section 9. Proofs are 
mainly deferred to Sections 10-13. 

1. Model of the PCR. The PCR is modeled by a nondecreasing Galton- 
Watson process {Sn) that starts from Sq > 1 particles with a Bernoulli re¬ 
production. We call {Sn) a Bernoulli branching process. More precisely, each 
particle x gives birth to Lj, = 1 or La, = 2 descendants independently of the 
other particles and with distribution 

(2) P(La; = 2):=A=:l-P(La; = l). 
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Each particle represents a single stranded molecule that comprises the region 
targeted at by the PCR or represents its complement on the other strand of 
the original duplex DNA molecule. Thus, the branching process counts the 
number Sn of successfully replicated biological sequences after n cycles of 
the reaction. Mutations are described by the states s(x) of the particles x 
as follows. Assume that the states of the Sq initial particles are given. For 
any particle x of any generation, the first descendant of x is a; itself and it 
has the same state s{x). If the other descendant y exists, its state s{y) is the 
result of the application of a given Markov kernel to s{x). In the additive 
model, the states are real numbers and the kernel is given by 

(3) s{y) := s{x) + iy, E(^y) =:/z, Y{iy)=:u, 

where all the random variables are independent, E denotes the expectation 
and V denotes the variance. In the PCR context, the law of is usually 
Poisson, so that v = y. In any case, A is the efficiency of the PCR (in its 
exponential phase) and y, is the mutation rate (per cycle and per particle). 
From now on, we assume that the initial population is homogeneous and we 
set s(x) = 0 for any x in the initial population. Thus, in the PCR context, 
s(x) is a nonnegative integer for any x after any number of cycles. Note, 
however, that our results are valid in the full generality of the additive 
model, as described above. 

In actual PCR, the efficiency A is not constant, but typically decreases 
to zero along the successive cycles of the reaction. To take into account this 
nonhomogeneity and the possibility of nonhomogeneous mutation rates, we 
choose two sequences (A^) and (fin) indexed by n > 1, and we replace A and y 
in (2) and (3) by An and yn when we construct the nth generation from the 
S'n-i particles of the (n — l)th generation. 

In some versions of PCR, the quantification of the product is done at 
the end of the reaction or only after a given number of cycles. By con¬ 
trast, real-time PCR, also called quantitative PCR, allows us to measure 
the amount of product after each cycle. Based on fluorescent detection sys¬ 
tems, this technology yields amplification plots that represent the accumu¬ 
lation of product during the successive cycles of the reaction; see Higuchi, 
Bollinger, Walsh and Griffith (1992) and Higuchi, Fockler, Bollinger and 
Watson (1993). Hence, we consider from now on that (A^) is known. On the 
other hand, very little seems to be known about the evolution, if any, of the 
mutation rate during the reaction. We assume that yn = y for every n and 
we seek to estimate the value of y. Let {xi, ..., xg} denote a uniform sample 
of size I drawn with replacement from the population after n cycles and let 
t denote its mean state, that is, 

i 

i=l 


PCR CONFIDENCE INTERVALS 


5 


Because the law of t is unknown, we have to rely on Bienayme-Chebyshev 
bounds, which state that t is in the interval bounded by 

E(t)±z-y^ 

with probability at least 1 — Xjz^. This supposes known values of E(t) and 
V(t). Although there exists no closed form of E(t) and V(t) that would be 
valid for every number n of generations, we can show that these quantities 
converge when Sq ^ oo and can compute the exact values of their limits, 
which we denote by E(t*) and and call infinite-population limits. The 

task is then to bound the discrepancies between the hnite-population mo¬ 
ments and their infinite-population limits. This involves the empirical laws 
of the population, which are defined in Section 2 and studied in Section 3. 
To present a flavor of the results we are aiming at, we introduce 


m 


:=/^E 




^ 1 -|- Afc ’ 


a 




Afc 




Then, we prove that m — e(S'o)// < E(t) < m with 


e(5o) := l/(5o - 1) if So > 2, e(l) := 3/2. 


Likewise, for any i>3, 

a^li < Y{t) < a^/i + (1 - l/e)r]{So){i^ + fi^) 

with r]{So) ■= 2/(S'o — 1) if So > 2 and r/(l) := 6. Specializations of the above 
are the easier to establish equalities 

E{t*) = m, Y{t*)=a^/i. 


Finally, we mention briefly that all these bounds on the discrepancy between 
the moments and the distributions of t and t* are of the right order. For 
instance, there exists an absolute constant c such that, for any i>3 and 
So>l, 

Y{t) > jl Y c{v + n‘^)/So. 


An unexpected consequence is that enlarging the size of the sample becomes 
useless for surprisingly small sample sizes; more precisely, as soon as the 
deviation, which behaves like l/5o, becomes the main contribution to Y(t), 
instead of ji. Thus, n*So is useless, where n* describes the behavior 
of (7^. We can choose n* := n for homogeneous reactions and n* := Ai -|- 
• • • An otherwise. We recall that the expected population at time n is 
(1 -|- Ai) • • • (1 -|- An)5'o, which can be much greater than n*So- 
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2. Notation. Call Cn the empirical law of the state of a particle drawn 
uniformly at random from the population at time n and let 77 ^ :=E(^„). 
That is, Cn and rjn are measures such that, for any nonnegative ip, 

s„ 

Cn{p>) ■= S~^ Vn{p>) ■= IE(Cn(¥^)). 

X=1 

By an abuse of notation, we denote the sum over the population at time n 
by a sum from x = 1 to x = Sn- For any measure r], M(r 7 ) and M 2 (r 7 ) denote 
the first and the second moments of r], and B(t 7 ) denotes its variance, that 
is, 

:= j sdrj{s), M 2 (r/) := J s‘^dT]{s), :=M 2 {r]) — M^r])"^ . 

In the next sections, some technical lemmas are valid in the broader setting 
of a general nondecreasing branching process and even for the harmonic 
moments of sums of i.i.d. or exchangeable random variables. 

Definition 1. Let Lj denote i.i.d. or exchangeable copies of a square 
integrable random variable L such that L > 1 a.s. and 

Affc := Li + • • • + Lfc. 

For any k>l, define 

H{k):=E{k/Mk), A{k):=H{k)-l/E{L), G{k) ■.= E{{k/Mkf). 

For any k>l, A[k) > 0. For i.i.d. sequences, A{k) —> 0 when A: —> 00 . For 
Bernoulli branching processes, that is, when the distribution of L is given 
by (2), we write A{k,\) instead of A{k) to specify the value of A. The same 
convention holds for H and G and other sequences that are defined later. 

Our results involve various parameters, functions of (A^) only, that we 
dehne below. 

Definition 2. Let := Afc/(1 + A^), 70 := 1, 7 o*^ := 1, and, for n > 1, 

n n 

ln-=W{l-Oik), 7 W := ]^(l-AfcA). 
k=l k=l 

Define IFo := 0, Wq := 0 and, for n > 1, 

n n 

Wn-.= Y.ak, <:=^afe(l-afc). 

k=l k=l 
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3. Empirical laws. From Theorem A, when S'o —> oo, rjn and Cn converge 
to a deterministic measure r/*, which is easy to describe [we omit the proof; 
see Piau (2004a)]. 

Theorem A. The law ry* coincides with the distribution of the random 
variable 


+ • • • + SnCnj 

where all the random variables and are independent, is Bernoulli 
with = 1) = afc = 1 — P(efc = 0), and follows the law used in (3) at 
the j th generation. Thus, 

n n 

M(ry*) = iikotk, = X! ^kOtk + - Ok)- 

k=l k=l 


3.1. Mean and distance in total variation. The approximations below 
stem from precise estimations of A{k) and of the harmonic moments of Sn 
that we develop later in this paper. The results of this section are adapted 
from Piau (2002, 2004a) and we omit their proofs. 

Theorem B. (i) First moment: 

n 

M(ry„) = M(ry;) - ^ /rfcAfc, := E(A(5fc-i, A^)). 

k=l 

(ii) Distance in total variation: 

n 

WVn - VnWTY < 14 := X! 

k=l 

Theorem C (Approximations). For any (Aj), Sq and k>l,Ak satisfies 
the inequalities 

(5o + l)Ak > 'jk-iOki^ — Afc)/(1 + Afc)^, 

{So — l)4fc < 7fc_iafc(l — Afc)/(1 + Afc)^, 

{So + l)Ak < 7f2iafc(l - Afc). 

This implies that 14 is bounded above and below by explicit functions 
of (/Tfc) and (Afc), divided by {So — 1) or {So + 1). We assume from now on 
that the law of ^ is constant along the generations or, more precisely, that its 
first two moments are. This is only for simplicity of notation and the reader 
should be able to guess the correct formulation of our results for variable 
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laws of ^ by analogy with the expressions in Theorem A. Thus, ^ :=E(^), 
u := V(^) and 


M(r?:)=/rW„, M{rjn)=fiWn-fiVn. 

The first assertion of Theorem C provides a lower bound of Vn for any Sq. 
The second assertion provides an upper bound of Vn for any Sq >2 that 
involves l/(5'o — 1). When S'o = 1, we should use the third assertion instead. 
In the rest of the paper, the bounds that involve l/(5'o + 1) are valid for any 
Sq and the bounds that involve l/(5o — 1) should be used only when 5o > 2. 
For instance, in Corollary 3, we should use the Vn lower bound for any Sq, 
the Vn upper bound for any 5o > 2 and the u" upper bound if S'o = 1 (for 
small values of Sq, the v'n bound is often less interesting than the v'n bound). 
These remarks apply to later results in this paper. 

Corollary 3. (i) One has Vn < (So + 1)14 < v'n 

n 

Vn '■= X] 7fc-l«A:(l — ^k )/(1 + Afc)^, 
k=l 
n 

^n:=I]7£iafc(l-Afc). 

k=l 


(ii) One has (So — 1)14 < Vn and Sol4 < v'^, with 

n 

v'n-='^^k-lO‘k{'i-->^k)- 
k=l 


Corollary 4. For any Sq > 2, 

fJ.Wn - yLVn/{SQ - 1) < M{rjn) < F^n “ AiUn/(So + 1) 


and 


\\r]n-r]nhv <Vn/{So-l). 

Remark 5. In contrast to the last statement above, we can show that, 
although (* = ry*, for any given n and (Afc), there exists a constant c such 
that 


E[||Cn-7:i|Tv]>c/v^ 


for every So > 1. We omit the proof. 
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3.2. Variance and uniform bounds of the empirical laws. We move to 
entirely new results, namely the estimation of second moments. Recall that 
we assumed for simplicity of notation that the two first moments of ^ are 
constant. Thus, 

D{p:) = nWn-p^W:,. 


Definition 6. Define Vf such that 0 < by the formula 

n 

Vf:='^A'k, A'k ■■= AI +{1 — 2ak)Ak. 

k=l 

The assertion in the definition above follows from A(-, A) < a. 

Theorem D. One has = iy(Wn — Vn) + — Vf). 

Proposition 7. There exists a constant V that depends on Sq such that 

/xWn -/xR < E(?7„) </xW„, \\r]n - rfnWTV <V, 


lyWn + - (p + i?)V < D(77„) < uWn + 

This holds with V := l/(<S'o — 1) for Sq > 2, and V := 3/2 for So = l. 

3.3. Additional term. Prom Sections 3.1 and 3.2, the two first moments 
and the distance in total variation of the empirical laws are described by Vn 
and Vf. The second moment of uniform samples involves an additional term 
Rn, defined as 

Rn := V(M(Cn)) = E(M(Cn)') - E(M(Cn))'. 

We now complete the results of Sections 3.1 and 3.2 with an in-depth study 
of Rn. Theorem E recursively describes the evolution of Rn. We control the 
terms of the recurrence in Lemmas 9 and 11 and Corollary 14, and finally 
get tractable bounds of Rn in Corollary 15. This section details the path 
that leads to these bounds of Rn, but the proofs of the steps themselves are 
postponed to Section 11. 


Definition 8. Introduce 


S(*=)-e(^). S'W:=v(T), 

Thus, B{k) = {H{k) - G{k))/k and B'{k) = G{k) - H{kf. 


Ml )■ 
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The theorem below is proved in Section 10. 


Theorem E. One has Ro = 0 and 

Rn+l =Rn + l^nBiSn)) + + E[B(Cn)S"(5„)], 

where one uses A„+i in the definition of B{Sn), B'{Sn) and B"{Sn)- 

Lemma 9 deals with the B and B' terms. The control of the B” term is 
more intricate and involves Lemma 11 and Corollary 14. 

Lemma 9. For any k> I, B{k) < h/k with b := (E(L) — 1)/E(L)^. In 
the Bernoulli case, b = a(l — a). For Bernoulli processes, B'{k) < b'/{k + 1) 
and B"{k) < b" jifk + 2), with 

b':=X, 6":=A(1-A). 


Definition 10. Let y > —1 denote a real number. Define nonnegative 
sequences C, C and C that depend on y by the following equations: 


(i) Let C(l) := 0 and, for any k>2, 
C{k) :=k‘^ik + y)E 


L1L2 


Mi{Mk + y) 


(ii) For any k>l such that k + y > 0, 


^ ’ \ MliMk+y) r ^ ^ VM 2 (M; 


-k) 

k + y) 


In the lemma below, iFn is the a-algebra generated by the n first genera¬ 
tions of the process. 

Lemma 11. Using A,i+i in the definition of the sequences C, C and C", 
we have, on the set {Sn + y > 0}, 

D(C^ \ ^ + C'{Sn)u + C'\Sn)u‘^. 

Kh’n+l + y } h>nFy 

Definition 12. For any real number y and any integer k > —y, let 

^ U 4 + ?/. 

Thus, Ho{k) = H{k). Since > k, C"{k) < C'{k) and 

{k + y)C'{k)<l-H{k). 
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On the other hand, for k>2, 
C{k)=Hy{k) 


+ {Li-L2f \ 

2 \Ml{Mk + y)) 


Thus, C{k) < Hy{k). At this point, the only additional tools that we need 
are estimations of H and Hy. These are developed in Section 12 and yield 
the next lemma. 


Lemma 13. (i) For any y > 0, C{k) < 1 — A/(y + 2). 

(ii) For y = —l and k>2, C{k) <1 — a. 

(hi) For any y > 1 — k, {k + y)C'{k) < a. 

Our next result states that Lemma 11 can be integrated to get a recursion 
of the form 



Corollary 14. For any y >0, (4) holds with Pn+i ^ — ^n+i/{y + 2,). 

If y = —I and Sq > 2, (4) holds with Pn+i ■= 1/(1 + A^+i). 

We are now in the position to estimate the three sums that the iteration 
of Theorem E yields. Assume hrst that 5o > 2. A weaker form of Lemma 9 
is that B{k) <b/{k — 1), B'{k) < h'/[k — 1) and B"{k) < h"/{k — 1). For the 
B and B' parts. Corollary 27 gives an upper bound of E[l/(S'fc — 1)]. For the 
B" part, we use the y = —1 form of Corollary 14. 

Assuming now that Sq = 1, we use the full form of Lemma 9 for B and Bk 
Corollary 27 allows us to bound E(l/S'fc) for the B term and E[l/(S'fc + 1)] 
for the B' term. For the B” term, we use the fact that B"{k) < h" jk and the 
y = 0 form of Corollary 14. This yields an upper bound of Fta of the form 

Rn < vu{n) + y^u'{n) + {u + y^)u"{n). 


Corollary 15. If Sq >2, we can choose 

n 

u{n) := ^ afc(l - afc)7fc_i/(£'o - 1), 
k=l 
n 

u'in) := ^ Afc7fc_i/(5o - 1), 

k=l 

n—1 n—1 

u"{n) := X] X] '^*+ 1(1 “ '^i+i)7i/(>S'o - !)• 
k=l i=k 
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// 'S'o > 1; we can choose 

n 

u{n) := “ «fc)7i-i/'S'o, 

k=l 

n 

u'in) := Afc7i-i/(‘S'o + 1), 

k=l 

n—l n—1 

u''{n) := ^ Xi+i{l - Ai+i)7fV5'o- 

k=i ^ ~ i=k 

Uniform bounds follow from the tricks described in Section 5. 

Corollary 16. There exist constants U, U' and U" that depend on Sq 
such that u{n) < U, u'{n) < U' and u"{n) < U”. For Sq > 2, this holds with 
U = U' = U" = I/{So - 1). For So = I, this holds with U = 2, U' = 3/2 and 
U" = 4. 

Corollary 17. For So > 2 (resp. for So = 1/, 

Rn < 2(z/ + ii^)/{So - 1) {resp. < 6z/ + ll/x^/2). 

4. Moments of uniform samples. Recall that the sample is {xi,... ,Xi}, 
that the family [^(xi)] is exchangeable, and that each s{xi) follows the law 
Pn- Thus, first taking the expectation with respect to the randomness of the 
sampling procedure, and then the expectation with respect to the branching 
process and to the mutation process (we skip the details), we get 

E{t) = M{pn), V(t) = B{pn)/i + (1 - l/mn- 

Hence, the results below are mostly corollaries to Section 3. The exception 
is Proposition 19 whose proof is in Section 11 [we omit the proof of part (iv), 
which is anecdotal]. 

Theorem F. One has E(t) = fiWn — p-Vn, where 14, is nonnegative and 
converges to 0 when So^ oo. More precisely, for any So > 2, 

Vn/{So + 1) < 14 < Vn/{So - 1) 

for a positive constant Vn, which depends on (Afc) only and is defined in 
Section 3.1. 

Theorem G. ITe have 


Y{t) = {uWn + - Zji + (1 - l/e)Rn, 
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where Zn and Rn are nonnegative and converge to 0 when Sq ^ oo. More 
precisely, 

Zn ■= I'Vn + <{1^ + tl^)Vn, Rn < rn{l' + /Sq 

for a positive constant rn that depends on (Xk) only and whose value can be 
deduced from Corollary 15. 

Corollary 18. When n 00 , E(t) 00 if and only if V(t) 00 if 
and only if (Afc) is not summable. For any {Xk), Zn and Rn are uniformly 
bounded. 

Proposition 19. (i) For any l>l, E(t) < E(t*). 

(ii) For i = I, Y{t) = B{r]n) < HVn) = V(P). 

(iii) For i>3,Y{t)>Y{t*). 

(iv) For i = 2, both situations are possible for any law of f,. That is, 
there exist generations n and efficiencies (A^) such that, for any law of f,, 
V(t) <Y{t*), respectively, Y{t) >Y{t*). 

Finally, uniform bounds hold that are valid for any (Afc). 

Proposition 20. (i) For any So > 2, 

fiWn - fi/{So - 1) < E(t) < piWn. 

For S'o = 1, p-/{So — 1) above should be replaeed by 3/i/2. 

(ii) Assume that £>3 and recall that Y{t*) = {uWn + For any 

So>2, 

Y{t*) < Y{t) < Y{t*) + (1 - l/i)2{iy + fi^)/{So - 1). 

For So = 1, 

Y{t*) < Y{t) < Y{t*) + (1 - l/i)6{n + 

(iii) Assume that i = \. Then, for any So > 2, 

Y{t*) - (p + fi^)/{So - 1) < Y{t) < Y{t*). 

For So = 1, (p + fJP)/{So — 1) above should be replaced by 3{v + /i^)/2. 

5. Random efficiencies. Assume that A„+i depends on Sn, as in the 
Michaelis-Menten setting we recalled in the Introduction, or on the full past 
Tn of the process up to time n. Perhaps surprisingly, some uniform bounds 
of the error term that we proved in the deterministic case still hold, but the 
behavior of the main term becomes somewhat unclear. In this section, we 
restrict the exposition to estimation of the first moment. Theorem H deals 
with the “error term” in the general case and Theorem I deals with the 
“main term” in the Michaelis-Menten case. 
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Theorem H. Let Wn ■= ^iWn), where Wn := ai + -h ctn is now ran¬ 

dom, and let V be defined as in Proposition 7. Then 

pLWn — fJ-V < E(t) < pLWn- 


Recall that V ~ l/5o when /So —> oo. Theorem H leaves open the question 
of the true behavior of E(f) in many interesting situations. For instance, 
the Michaelis-Menten law implies that Wn ~ uD/Sq when So —> oo, all the 
other parameters being fixed. Thus, the main term Wn and the error term 
V become of the same order. Before coming back to the Michaelis-Menten 
case, we sketch the proof of Theorem H. We hrst mention without proof the 
crucial identities 

n n 

X!^fc7fc = l-7n, ^Afc7^*li = i(l-7W). 

k=l k=l 


Sketch of the proof of Theorem H. A simple consequence of the 
monotonicity of H (see Lemma 30) is 

An+l 


E( i ^ 


s„ s 


n+1 


s„, > 


2S„ 


Taking expectations of both sides and summing over n, we get 

2 


n>0 


An+1 


< 


So' 


Likewise, for any A: > 0, 
1 


1 VSfc+i-iy-^o VS„y'-So \Sk+iJ 


So- 


Thus, if (Afc) is not summable, S^ —> oo a.s. and 


> Ve , - 

^0-1 -,^0 Un ;-So 


These bounds are tight since Sn = 2” So when A^, = 1 for every n. □ 


We now study Wn in the Michaelis-Menten case, that is, when 


Wn 



D 

D C Sfc_i 


which depends on n and (So, C, D). Easy remarks are that S„ ~ Dn almost 
surely and Wn ~ logn when n ^ oo, all the other parameters being fixed, 
and that Wn ~ uD/Sq when So ^ oo, all the other parameters being fixed. 
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Estimations for fixed values of n and Sq are as follows. Introduce the 
reduced variables 

so := So/C, b := C/D, 

and note that 6(1 + sq) > 1 since Ai = D/(C + Sq) < 1. The regime we are 
interested in is when sq is small and 6 is about 1, but the following result 
makes no such assumption. 

Theorem I. In the Michaelis-Menten case, w~ < Wn < with 
wt ■■= (2+ (26- l)/so)log(l + nso/(2 + so)), 

Wn ■= log(l + n/(l + 6(1 + So)))- 
When 6> 1, we can choose Wn ■=Wn with 

Wn ■= (2 + (26 - l)/so) log(l + nso/(26(1 + sq)^)). 

In the special case 6 = 1, we get 

log(l + n/(2 + so)) <Wn< (2 + l/so)log(l + nso/(2(l + sq)^)). 

Proof of Theorem I. The convexity of the function x>-^ 1/x yields 
Wn+l -Wn>D/{D + C + E{Sn)). 

Since E(5„+i) =E(5„) +E(5„A„+i) and SnXn+i < D, 

^{Sn) <So + nD. 

This yields Wn > C{n,b{l + sq)), where 

n 

C{n,t) :=^l/(t + A;) >log(l + n/(t+l)). 
k=l 

This proves the tc” bound. On the other hand, the concavity of the function 
x^x/{l + x) yields 

Wn+l -Wn< DE{l/Sn)/{l + {D + C')E(l/5„)). 

From Lemma 27, 

W.{l/Sn+l\En) < (1 - An+l/2)/5n. 

For Michaelis-Menten values of A^+i, this yields 

E(l/5„+i) < (1 - Il/(2C))E(l/50 + (Z1/(2C'))E(1/(C' + 5J). 

The same concavity inequality with respect to 1/Sn that we used a few lines 
above allows us to deal with the term E(1/(C' + Sn))- This yields 

E(l/5„+i)<V^(E(l/5J) 
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for the function ijj defined by 

:= (1 — l/{2b))x + l/(26)x/(l + Cx). 

It is a simple matter to show that l/'ip{x) >l/a + l/3:for any x <1, with 

a:=2/D + {2h-l)/So. 

Thus E(l/S'„) < (l/S'o) < a/(n + aS^). This yields an upper bound of 

Wn+i — Wn which reads, after some cumbersome algebra, 

Wn<sC{n,r), s:=aD, r := a{So + D + C) — 1. 

This can be written in (b, sq) terms only, as 

s = 2 + (26-l)/so, 

r = {26(1 + so)^ + 1 - 

When 6 > 1, the expression of rc* stems from 

C(n,r) < log(l + n/r), r > 2b{l + sof /sq. 

In the general case, 6 > 6o := 1/(1 + sq) and r(6, sq) > r(6o,so) since r is 
increasing in 6. Finally r(6o,so) = (2 + so)/so yields the value of , since 

C(n,r) < log(l + n/r) < log(l + n/r(6o, sq)). □ 

6. General branching processes. The results of Sections 3 and 4 can be 
extended, at a relatively low cost, to a wider context. Assume for instance 
that each particle x in the nth generation gives birth to > 1 children, 
where (Zx) is i.i.d. and each Zx is distributed like Ln+i, say. On the event 
{Zx = k}, order the k children of x from yi to r/k and decide that the 
fc-dimensional random vector ■ ■ ■ ,^{yk)) follows a given law . 

Do this independently for different particles x in the same generation and 
independently in different generations. 

The PCR model is recovered when the law of Ln is (1 — An)(5i + \n52 and 
when TTi = 6o and = 5o ® for a given distribution tt on the nonnegative 
real numbers. 

Coming back to the general setting, assume that every is integrable 
and call Zn the size biased distribution of Ln, dehned by Zn{k) := kd^, where 

a^=P(L„ = A;)/E(L„). 

Let the laws 7r{( be (square) integrable for any n > 1 and any /c > 1 in the 

support of the law of Ln- Let denote the expectation of ^(yi) H- \-^{yk) 

under Then the following analogue of Theorem A holds. 
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Theorem J. The law r/* coincides with the law of the random variable 

Ci + '-' + C, 

where are independent and distributed as follows. For any fixed n > 

1, draw k>l at random along the sized biased distribution Zn, then choose 
the index i uniformly at random in fc} and let (,* denote a copy of 

the ith marginal of Thus, for instance, 

n 

k=lj>l 

In the PCR context, the only nonzero term is = Tk and a 2 = \k/{^ + 
Afc) is afc. Thus M(ry*) is the sum of as in Theorem A. 

The next step is to estimate the discrepancy between ry* and rjn- With 
regard to first moments, their difference can now be negative or positive. 


Proposition 21. One has 

n 

k=lj>l 

where the error terms Sj, which can he positive or negative, are hounded by 
functions of j and of the reproducing laws of (Lj)j<fc. Such bounds can be 
deduced from the inequalities 

0 < nLk)e] - {j - E{Lk))E{l/Sk-i) 

< E({(j - E{Lk)f + {Sk-i - l)Y{Lk)}/Sl_,). 


Assume, additionally, that the first moment of each marginal of is 
bounded by a given number /Tq (or that \pj \ < Jpq) for every j > 1 and 
1 < A; < n. Then after some computations, Proposition 21 yields 

n 

|M(ry„) -M(ry;)| < E/^oE(l/5fc-i)E(L|)/E(Lfc)2. 

k=l 

Recalling finally that 


E{l/Sk-i) < E{l/Lk-i) • ■■E{l/L^)/So, 


we get the main result of this section. 


Theorem K. Fix n and fix some reproduction and mutation mecha¬ 
nisms for the n first generations. Assume that there exists r>l,po and Lq 
finite, such that, for every 1 <k <n and every j >1, 

\p^\<fpo, E(l2+-)<Lo. 
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Then there exists a finite constant C : = C{n, Lq) such that, for every > 1, 

n 

|M(r?„) - M(ry;)| < CfioSo\ M(ry;) = ^ E 

k=ij>i 

This holds with the (very crude) constant C :=nLo- 

We could deal with the second moment of r]n along similar lines, but we 
leave this task to the interested reader. 

7 . Estimation of mutation rates. In the rest of the paper, (A^) is a given 
deterministic sequence as in Section 5. Let fi denote the point estimator of 
yL by the first moment method, that is, the solution of the equation t = E(t) 
in the unknown //. Let denote its infinite-population limit, that is, the 
solution of t = E(t*) in the unknown /r. When the efficiency is constant, fi^ 
is the estimator due to Sun (1995). 

Corollary 22. We have fi> (1^ = tfWn- More precisely, for any So > 

1 , 

1-^<^<1_ r:=^, r":=^. 

5*0 + 1 h 5o + 1 Wn Wn 

When is the finite-population correction to E(t*) negligible? Assume that 
the n first efficiencies are greater than A*. Then (we omit the proof) 


Corollary 23. //5oninf(Afc) 3> 1, then fi^ ^ fi. 

In the opposite direction, consider the situation where n = 25, Afc := Xi/k, 
Ai := 0.25 and 5o = 1. Thus, the efficiency Xk decreases from Ai = 0.25 to 
^25 = 0.01. Then r = 0.495 and r" = 0.770, meaning that the true estimator 
fi is between 1.33 • (2^ and 1.63 • /i*. 

With regard to second moment properties, we first make the following 
remark, direct from Theorem G. 

Corollary 24. In the infinite-population limit, /I* is a consistent es¬ 
timator of y. For finite populations, fi is not a consistent estimator of y. 

Assuming that /I* ~ fi, the confidence interval of level 1 — l/z^ for /r 
corresponds to t lying in the interval bounded by 

E(C)±z-^V(C). 
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In the Poisson case, ]E(^) = /i = V(^). Thus, replacing /r in Y{t*) by its point 
estimator /!*, we get an approximative confidence interval for [i, bounded 
by the points /I* ± zd^jWni where 

Ai* := tjWn, al := (t + t^W'jWDjL 
When /r is small, and the interval is approximately bounded by 

Ai*(l ± zjyJTi), 

where (ti) is the total number of mutations in the sample. 

8 . Data set. Weiss and von Haeseler (1997) applied their coalescent 
method to the data set of Saiki et al. (1988). The efficiency sequence (A^) is 
not provided; neither is the initial population Sq. However, following Weiss 
and von Haeseler (1997), the extent of the amplification after 20, 25 and 30 
cycles [provided by Saiki et al. (1988)] allows us to compute hypothetical 
efficiencies, which are constant and equal to A during the 20 first cycles, then 
constant and equal to A' during the cycles between 21 and 25, and finally 
constant and equal to X” during the cycles between 26 and 30. Numerically, 

A = 0.872, A'= 0.743, A" = 0.146. 

For a sample of size i = 28, 17 mutations were observed. Thus, t = 17/28. We 
find W 30 = 12.085 and /i* = t/Wso = 0.05024. Furthermore, = 0.38435 
and U 30 = 0.03653. Thus, 

r (0.05032,0.05105), for Sq = 1, 

Jl G (0.05025,0.05039), for Sq = 10, 

[ (0.05024,0.05026), for So = 100. 

These intervals are much smaller than the uncertainties associated to, first, 
the fact that the efficiencies are unknown and, second, the variation that a 
difference of ±1 in the count of the observed mutations would yield. With 
regard to the efficiencies, we followed Weiss and von Haeseler (1997) and 
chose a constant value from A: = 1 to A: = 20, then from A: = 21 to A; = 25, and 
finally from A: = 26 to A; = 30. More regular variations of (A^) are possible. 

The maximum likelihood method of Weiss and von Haeseler (1997) gives 
values that are similar to ours when Sq is large and gives markedly different 
values when Sq is small, namely an estimated value of /r of 0.060 for 5o = 1. 
This suggests that the mean of the posterior distribution of /i is not the 
point where its density is maximal; in other words, that the distribution is 
far from being symmetric around its mean. 

With regard to confidence intervals, VF 3 Q = 6.755, and d* = 0.149, which 
is quite comparable to \/t/i = 0.147; see the remark at the end of Section 7. 
This yields the interval bounded by 0.05024 ± 2:0.01236. For instance, ^ G 
(0.02552,0.07496) at a level of confidence of 75%. 

With regard to the variance, Weiss and von Haeseler (1997) simulate the 
correct value of 0.012 when Sq is large and a variance of 0.016 when Sq = 1 . 
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9. On the mean of the sample. 

9.1. Boundary effects. We recover some striking features of the numeri¬ 
cal simulations in Weiss and von Haeseler (1995). Recall that the histogram 
of T]n is always to the left of the histogram of ry*, meaning that 77 * stochasti¬ 
cally dominates r]n (this is specific to the case where ^ > 0 almost surely); see 
Piau (2002). Furthermore, the gap between the two distributions decreases 
to zero when A —> 1. This last fact follows from v'^<2{l — X). 

Another property that is not visible on the simulations is that, for n fixed 
and when A —> 0, the gap goes to zero as well. Our results prove that this 
effect appears only when A is so small that nX <C 1, that is, for values of the 
efficiency that Weiss and von Haeseler did not consider in their simulations. 

9.2. First generations effect. Considering a smaller number of cycles in 
our test case in Section 7, we get similar values of the ratio r. For instance, if 
n = 5, respectively, if n = 10, then r = 0.521, respectively, r = 0.516. Roughly 
speaking, this means that the approximation fails only during the first gen¬ 
erations, that is, when is not large enough yet. 


9.3. Estimating efficiencies. For Bernoulli branching processes, the se¬ 
quence ( 7 n<S'n) is a positive martingale, bounded in (and in every L^, 
p>2). Thus, it converges, almost surely and in the mean to a random limit 
which is in (0, -|-oo) almost surely. This implies that 

5'n+l/[<S'n(l + An+l)] —> 1 a.s. 

This fact, rather than the observation that E(5n+i) = E(5n)(l + An+i), is 
the reason why Sn+i/Sn is a good estimator of (1 -|- A^+i). Additionally, 
some concentration of measure phenomena occurs; see the large deviations 
results of Athreya (1994) and Athreya and Vidyashankar (1995). 


10. On the proof of stochastic recursions. We first recall some basic 
tools. We have M(Cn) = Tn/Sn, where T„ denotes the sum of the states of 
the population at time n. For any particle x at time n, let e{x) := 1 if x 
has two children; otherwise, let e{x) := 0. Let {f{x)) denote i.i.d. random 
variables distributed like f. Then 


Tn+i = ^ s(a:)(l e(x)) -h ^(x)e(x), 5^+1 = 5 ^ 1 + ^(^)- 


X=1 


X=1 


The exchangeability of e(x) implies that 

T -l-e(x) 




•S'?!-1-1 


Fn = 1 . 
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Likewise, 


5„E 



Integrating Tn+i/Sn+i, we get 




E(M(Cn+l)|^n) = M(Cn) + ^^[l " H{Sn)]. 


Turning to the evaluation of the second moment, we use once again the 
expression of Tn+i and the exchangeability properties that arise. Separating 
carefully the square terms from the rectangular terms in and skipping 

the details, we get 

E(M(Cn+l)Vn) =M2(Cn)S"(5„) +M(Cn)'Sl(5n) 

+ 2/rM(Cn)[l - H{Sn)] + yB{Sn) + 


where the only coefficients that are not already defined are 

5i(l):=0, Bi{k):=k‘^E{LiL2/Mi), k>2, 

B2{k) ■.= E{{l-k/Muf ). 


We take the expectation of both sides and substract to this the recursion 
relationship for E(M(^„)) squared. The p terms cancel. The terms add 
to 


B2{Sn)-[l-H{Sn)? = B'{Sn). 

The V term is B{Sn)- The M 2 (Cn) and terms remain and we must 

show that the coefficients of these, namely B"{Sn) and Bi{Sn) — 1, add to 
0. Since M| is the sum of k squares L? and of k{k — 1) products LiLj for 
j, the exchangeability of (Lj) yields 

kE{{Ll - LiL2)/M|) + k^E{LiL2/Ml) = 1, 

that is, B"{k) + B\{k) = 1. This ends the proof of Theorem E. The proof of 
Lemma 11 uses similar techniques and we omit it. 


11. Additional term—proofs. We begin with the proof of Lemma 9 and 
with simple considerations about the sequences B, B' and B" that are valid 
for any distribution of L. Laplace’s method and the law of large numbers 
yield that, when A: —> oo, 


kB{k) 


E(L) - 1 


kB'{k) 


V(L) 


kB"{k) 


Y{L) 


E(L)2 ’ ^ ^ E(L)4’ ^ ^ E(L)2' 

This implies that Lemma 9 cannot hold with b < a(l — a), b' < A(l — A)/(l + 
A)"^ or b" < A(1 — A)/(l + A)^. Thus, Lemma 9 is optimal when A —> 0. 
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Proof of Lemma 9. The first assertion of Lemma 9 stems from the 
concavity of (m — k)/m? with respect to m on the interval {k,2k). The B" 
result follows from the facts that, for fc > 2, 

that Mfc _2 > k — 2 a.s. and that k/{k + 1)^ < 1/(A; + 2). We postpone (a 
stronger version of) the B' assertion to Lemma 25. □ 

Remark. In the Bernoulli case, B"{k) > (V(L)/E(L)^) ■ k/{k + 1)^. 

Remark. We can refine the uniform bounds as follows. First, < 1 
and v'^ < 3. Assuming that \k > A* for every k <n, 

Vn<l-X*- 

On the other hand, assuming that Xk < A"*" for every k <n, 

This yields lower bounds of Zn and since, for instance, 

I'Vn/iSo + l)<Zn< + n‘^)Vn/{So - 1 ). 

Lemma 25 is a key step in the proof of Lemma 26. 

Lemma 25. (i) One has G{k) < 1/(1 + A)^ + 3A{k). 

(ii) For any integer j >1, 

nik/MkY) < 1/(1 + A)^' + A{k)j{j + l)/ 2 . 

(hi) One has B'{k) < A{k){l + 3A)/(1 + A). 

(iv) One has B'\k) < h'/{k + 1), with 

b' := A(1 - A)(l + 3A)/(1 + Xf < A. 

Proof. Assertion (iv) follows from assertion (hi) and from the upper 
bound of A{k) in Corollary 34. Assertion (ih) follows from assertion (i) and 
from the fact that 

B'{k) = G{k) - H{kf < [3 - 2/(1 + A)]A(fe). 

The convexity of the function m^3k/m — k"^ I'm? on m> k implies that 

3E{k/Mk) - mk/Mk)^) > 3/(1 + A) - 1/(1 + A)^. 

Going back to the definitions, this is assertion (i). For assertion (ii), we 
use the convexity of the function (j + 2)jm^ — jk/m^~^^ on m > A; to 
perform a recursion on j. □ 
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The contribution of v, respectively, of to Rn+i — Rn is greater than 
K{B{Sn)), respectively, E(i?'(S'„)). The contribution of u, respectively, of fj?, 
to Zn+i — Zn is E,{A{Sn)), respectively, From Corollary 34, A{k) < 

a{l — A)/2. Hence, 

A'^j^i < E(H'(5„)), where A'{k) := (1 — X)A{k). 

Thus, Lemma 26 below implies that Rn > ■^n/2, that is, the i>3 point in 
Proposition 19. 

Lemma 26. For any k>l, B{k) >A{k)/2 and B'{k) >A'{k)/2. 

Proof. The definitions and two lines of algebra show that the inequality 
B{k) > A{k)/2 is equivalent to 

{k-2)A{k) + 2G{k)<2{l-a). 

From assertion (i) of Lemma 25, a sufficient condition is 

{k + A)A{k) < 2a{l — a). 

The upper bound of A(k) in Corollary 34 implies that this holds if 

(fc + 4)/(fc + 1) < 2/(1 — A^). 

This settles the A; > 2 case, since (/c + 4)/(A: + 1) <2 for k>2. Because the 
A: = 1 case is obvious, this proves that B{k) > A{k)/2 for any A; > 1. 

Our proof of B'[k) > A'{k)/2 is more intricate. According to Corollary 36 
in Section 12.4, whose notation we use from now on, it is enough to check 
that 

G{k) > H{kf + (1 - A)(l + \){H{k) - l)/2, 
that is, after some rearrangements, to check that 

(1 + A=)G,t > (3 + A=)G2 + (1 - A^)^ + 2 (g; + ^ + ^). 

First, G 2 < (1 — 2A)Gi. Furthermore, since n 2 G (0,1/4), for any k>2, 

G 3 = Gi[l + 3(A: — 2 )n 2 ]/(l + A) < Giuk, u := 3/4. 

For k>2, this yields the sufficient condition 

(1 + \^)k > (3 + A2)(1 - 2A) + u{l - A^) 

+ 2Gi + 2^((1-2A)2 + u2). 

Since Gi < A(1 — A), the sum of the three first terms on the right-hand side 
is the sum of 3 -|- u and of a polynomial in A with negative coefficients, hence 
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at most 3 + u. For any k > 4, the last term on the right-hand side is at 
most 2Gi(l -|- v?)/k‘^ < 1/16 because tt < 1, Gi < 1/4 and k>4. Hence, the 
right-hand side is at most 3-|-3/4-|-l/16<4. Since the left-hand side is at 
least k, we are done for any k> 4. 

Finally, setting D{k, A) := {2B'{k) — A'{k)){\ + A)/[A(1 — A)], we get, with 
the help of Maple® software, 

F»(1,A) = A>0, 

18F»(2, A) = 2 -h lOA - 9A^ -f A^ > 2, 

200F)(3, A) = 24 -h (1 - A)(l -h 58A - SSA^ - A^) > 24. 

Hence, B'{k) > A'{k)l2 for any k>l. □ 

12. Harmonic moments—statements. 

12.1. Method. Following the technique of Piau (2002, 2004a), we seek to 
compare A{k) with 1/k and to bound Hy(k). Iterating these bounds yields 
good estimations of the harmonic moments of Sk- 

A remarkable feature of the rescaled harmonic mean Hy{k) of the sum 
of k i.i.d. positive random variables is that, for y > 0, Hy{k) is a decreasing 
function of k for A: > 1. This very general fact seems to be unknown. It 
holds in a wider context (see Lemma 30) and we prove it by a completely 
deterministic method in Section 13.1. 

In the restricted context of the Bernoulli branching process, we uncover 
two other monotonicities. First, for y>l, H-y{k) is an increasing function of 
k for k > y. Second, a suitably normalized correction of H{k) is decreasing 
(however, see Remark in Section 12.3). Thus, Lemmas 30, 31 and 33 are 
crucial steps in our proofs. We state the following consequence. 

Corollary 27. In the Bernoulli ease, for any y > 0 and Sq > 1, 

In/ {So + y)< E[l/(5„ + y)] < + y)- 

For any Sq > y >1, 

E[l /{Sn - y)] < 7n/(5'o - y). 

As a consequence, for any So > 2, 

In/So < E(l/5„) < 7n/(5'o - 1). 

Remark 28. If So > 2, E(l/5„) is exactly of order 7 „, that is, of the 
order of 1/E(5„). If 5o = 1, our results show only that E(l/5n) is at most of 
order jn , which can be much greater than jn = 1/E(5ri). An upper bound 
of E(l/5„) of order jn indeed holds when iSq = 1, namely 

(5) E(l/5„)<7n(l + l/A;), 
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where A* is the minimum of the sequence (A^) up to time n (we omit the 
proof). This bound is optimal, up to a factor 2, since, when A^ = A is con¬ 
stant, we can show that the limit of E(l/5n)/7n is at least (1 -|- l/A)/2; see 
Piau (2004a or b). 

Equation (5) could be used to replace the products in all our upper 
bounds by 7„(1 -|- 1/A*). Recall finally that, in actual reactions, the sequence 
(Afc) is nonincreasing. Then A* = A„ and 7„(1 -|- 1/A*) = 7 „_i/A„. 

12.2. Monotonicities of . 

Definition 29. For any integer A: > 1 and real number y such that 
A: -|- y > 0, and for any nonnegative function (p, define 

We recover Hy when p is the convex and decreasing function p{x) = \/x. 
For any locally bounded p and any L > 1 of bounded support, the law of 
large numbers and an easy domination imply that H‘^{k) —> p{¥^[L]) when 
k —> oo. 


Lemma 30. (i) Assume that [Li) is exchangeable and that p is con¬ 

vex. Then Hq is nonincreasing. If furthermore (Li) is i.i.d., then HQ{k) > 
H^{cx^)=p{E[L]). 

(ii) Assume that y >0, that (Li) is exchangeable, and that p is convex 
and nonincreasing. Then is nonincreasing. If (Li) is furthermore i.i.d., 
then H^{k) > HiP{oo) = piE[L]). 

For instance, Hy decreases from Hy{l) =E[(1 + y)/{L-\-y)] to 1/E(L) for 
any y>0. For Bernoulli branching processes, we get 

1 - a < Hy{k) < 1 - X/{y 2). 

Furthermore, Hy describes one step of the evolution of 1/(<S'„ + y), since 
Hy{Sn)=E{ 5„. I a.s. 

This yields the first part of Corollary 27. 


Lemma 31. For Bernoulli branching processes and y >1, the sequence 
H-y{k) is increasing for k> y. Thus, 

H.y{k) < 1/E(L) = 1-0. 


Iterating Lemma 31 yields the second part of Corollary 27. 
Remark. For non-Bernoulli laws, H-i can fail to be increasing. 
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12.3. Monotonicity of A. Lemmas 32 and 33 are proved in Section 13.2. 


Lemma 32. For A; = 1, A(l) = E(l/L) — 1/E(L). Whenk^oo, 

kA{k)^Y{L)/E{Lf. 

Lemma 33. For Bernoulli branching processes, {k + 1)^(A:) is decreas¬ 
ing. 

For Bernoulli branching processes, A(l) = a(l —A)/2 andV(L) = A(l —A). 


Corollary 34. For Bernoulli branching processes, 

a{l — A)/(l + A)^ < (A: + l)A(A) < a{l — A). 

Corollary 35. In particular, kA{k) < a(l — A). 

Remark, (i) The sequence kA{k) is not always decreasing, even in the 
restricted case of Bernoulli branching processes. If A < 2 — we can show 
that kA{k) is in fact increasing (we omit the proof). 

(ii) In the general context, the sequence {k + l)A[k) is not always de¬ 
creasing. If the law of L is (1 —p)di -Fpds and if p < 1/16, we can show that 
{k l)A{k) is in fact increasing (we omit the proof). 

12.4. Taylor expansions. Precise estimates of the remaining terms of the 
Taylor expansions of the functions x^Xjx and x^ 1/x^ yield bounds of 
Gik) and H{k), hence of A{k). Set n 2 := A(1 — A). The following results are 
proved in Section 13.3. 

Corollary 36. One has G{k)>G{k)/{l + X)^ and H{k) < H{k)/{l-\- 
X) with 

G{k) := 1 + 3Gi/A - + 2G^/k^, 

H{k) := 1 + Gi/k - G^/k^ + Gzjk^, 

where 

^ _ n2 _re2(l-2A) _ n2(l + 3(A - 2 ) 712 ) 

(i+a) 3 ■ (TiAp ■ 

Corollaries 34 and 36 provide tight bounds of A[k). For instance, when 
A ^ 0, A{k)/\ is asymptotically between 1/A: —1/A^-|-1/A^ and 1/(A + 1). 
The difference is 1/[A:^(A: + 1)]. In the general case, the following corollary 
holds. 
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Corollary 37. For any k>l, 


A(l-A) 1 
(1 + A)3 fc + l 


< A{k) < 


A(l-A)fc + 1 
(1 + A)^ P 


Therefore, for any k>2, 


A{k) < 


A(l-A) 1 
(1 + A)3 A;-1‘ 


13. Harmonic moments—proofs. 


13.1. Proof of the monotonicities of . Surprisingly (to us), Lemmas 
30 and 31 reflect almost sure properties that have nothing to do with 
randomness. Assume first that y = 0 and note that + 1) is the 

barycenter with equal weights of the {k + 1) random variables /k, where 

U\ 

Ml := Mfc+i — Li. Thus, the convexity of ip implies that 



(i) 

By exchangeability, each Mf, is distributed like ■ Taking the expectations 
of both sides of (6) yields the result. 

For y > 0, set Nk := (M^ + y)/ {k + y) and yk := y/[k{y + A: + 1)] > 0. 
Tedious computations show that 

^k+l = —Vk + (1 + Vk) 1 \ 

K i 

where yields like Mk yields Nk- Since each > 1, A^+i is greater 

(i) 

than the barycenter of the random variables A^ ^, which are distributed like 
Nk- Since ip is nonincreasing, taking expectations yields Lemma 30. 

Likewise, the proof of Lemma 31 for Bernoulli random variables is entirely 
nonrandom. Assume that Lj = 2 for i indices j between 1 and A: + 1, and 
assume that Lj = 1 for the k-\-l — i other indices j. Then Mk+i = k + i + 

Mf: = k -\- i — 1 for i indices j and = k + i for the k + \ — i other 
indices j. Thus, it is enough to check the almost sure inequality 

A; + 1 — y ^k — yf i A; + l — 

A; + i + 1 — y “ A; + 1 VA; + i — 1 — y k + i — yj 

for any 0 < i < A: + 1 and k > y. After simplifications, this is equivalent to 
the condition that either z = 0 or z > 1 and 

(y - 1)^ + (y + i)i -1 - y^ > 0, 

which holds for any z > 1 and A: > y > 1. Thus, Lemma 31 holds. 
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13.2. Proof of the properties of A. For |t| < 1, let f{t) := E(t^) and 
g{t) :=E((Li — L2)‘^t^^^^^~^)/2. An integration by parts yields 

Mk)= [ g{t)f(t)~‘^f{t)’^dt. 

Jo 

(We omit the details.) Since \f(t)\ < 1 for \t\ < 1 and /(I) = 1, when k —> oo, 
the integral is controlled by the behavior of the integrand when t —> 1. More 
precisely, from Laplace’s method, 

A{k)^g{l)ni)-y[kf'{l)], 

where g(l) and /'(I) denote limits when t —> 1, t < 1. Since g(l) = V(L) and 
f'{t) —> E(L), Lemma 32 holds. 

For Bernoulli branching processes, g{t) = A(1 — A) is constant. Thus, A{k) 
takes the simpler form A(k) = A(1 — X)I{k, 1), where 

I{k,i):= /'(/)'=(/')-''. 

Jo 

The facts that (fc +1)/'/^ is the derivative of and that f” = 2A, together 
with an integration by parts yield the recursion 

{k + l)/(fc, £) = (! + A)-(2^+^) + 2A(2^ + l)I{k + 1,^+1). 

Since is decreasing for any i, this implies that the sequence (A; + 

l)I{k,i) is a decreasing function of k and, finally, that Lemma 33 holds. 

13.3. Proofs of Taylor expansions. Let Ti{f){xo, ■) denote the Taylor ex¬ 
pansion at xo of the function /, up to order i > 0. For instance, if h{x) := Ijx 
and g{x) := 1/x^, 

Ti{h){xo,x) = (x - xoY/xl"^^, 

j=o 

Ti{g){xo,x) = + l)(x - xoy/xl'^"^. 

j=o 

We now estimate the remaining terms, perhaps more precisely than is usual. 


Lemma 38. For any i>l, 

h{x) =T 2 i-i{h){xo,x) + {x - xof'-/{ x^qx), 

g{x) = T 2 i-i{g){xo,x) + {x - xo)^*(xo + 2ix)/(xo*+^x^). 


Proof. Recursion on A > 1. □ 
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Corollary 39. If x and xq are both in a positive interval (x_,a:+), 

h{x) < T 2 i-i{h){xQ,x) + hi{x - xof\ 

g{x) >T2i-i{g){xo,x) + gi{x - xo)‘^\ 


where 


hi .— 


2i ’ 
Xr\ X _ 


2ix+ + Xq 2i 
•= „2i+l 2 - „2i+l • 

*^Q *^0 


A similar lower bound of h{x) and a similar upper bound of g{x) hold. 


Remark. Taylor-Lagrange formulas yield greater error terms for h{x) 
and for g{x) that are, respectively, 

hi := r,.., 1 and g^ := 




* • ™2i+2 

X I 


We apply Corollary 39 to x := Mk/k, xq := 1 + A, x_ := 1 and x+ := 2, 
and i := 2. Introducing mj := E((x — xq)^) and using mi = 0, we get 

H [ k ) < 1/(1 + A) + 771 - 2/(1 d" A)^ — 777.3 /(1 + A)^ + 777.4 /(1 + A)^, 

G{k) > 1/(1 + A)^ + 37772 /(1 + A)”^ — ^im^f (1 + A)^ + 27774 /(1 + A)^. 

If Hj denotes the jth moment of L — E(L), m 2 = n 2 fk, 7773 = n^jk"^ and 

7774 = (774 + 3{k — 1 ) 772 )/^^. 

If L is Bernoulli, 772 = A(1 — A), 773 = 772(1 — 2A) and 774 = 772(1 — 3772 ). This 
yields Corollary 36. 

Proof of Corollary 37. The expression of H{k) in Corollary 36 
yields 

where, after some rearrangements, a can be written as 

- X _ ^ 1-4A(1-A) _ 2A(1-A) 

V k) 1 + A (l + A)/f 

Thus, o < 1 and the upper bound is proved. The lower bound is in Corol¬ 
lary 34. □ 
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